wasm - fix debug info & misc #96

Luukdegram · 2024-01-03T05:50:03Z

When two or more symbols point to the same function (and body), we would previously write the same function body multiple times. Now we deduplicate them and point all aliased symbols to the same atom, so that we emit a function and its body just once.
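
The deduplication described above can be sketched roughly as follows. This is an illustration in Python, not the linker's actual code: `Atom`, `Symbol`, and `assign_atoms` are hypothetical names, and keying the lookup on the function index is an assumption about how aliases are detected.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    """One chunk of output; here, a single function body (hypothetical type)."""
    function_index: int
    body: bytes


@dataclass
class Symbol:
    """A symbol naming a function; several symbols may share one index."""
    name: str
    function_index: int


def assign_atoms(symbols, bodies):
    """Return (symbol name -> atom, list of unique atoms).

    `bodies` maps a function index to its body bytes. An atom is created
    the first time a function index is seen; later aliases reuse it.
    """
    atom_by_function = {}
    atom_for_symbol = {}
    for sym in symbols:
        atom = atom_by_function.get(sym.function_index)
        if atom is None:
            atom = Atom(sym.function_index, bodies[sym.function_index])
            atom_by_function[sym.function_index] = atom
        atom_for_symbol[sym.name] = atom
    return atom_for_symbol, list(atom_by_function.values())


symbols = [Symbol("foo", 0), Symbol("foo_alias", 0), Symbol("bar", 1)]
bodies = {0: b"\x0b", 1: b"\x00\x0b"}
atom_for_symbol, atoms = assign_atoms(symbols, bodies)
```

Here `foo` and `foo_alias` end up sharing one atom, so only two bodies would be written for three symbols.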

Rather than expensively iterating to the first atom and then iterating over all the atoms back to the end, we now simply start from the end and allocate the last atom as the first atom onwards. This simplifies the logic and we do not have to iterate atoms twice.

Previously we would only mark debug sections if they contained relocations that targeted a marked symbol. However, no debug sections would get parsed as they wouldn't be represented by exported symbols and therefore not get marked and parsed themselves. Now, we create synthetic symbols for all debug sections and ensure custom sections always get marked alive to ensure we emit them correctly.
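
A rough sketch of the new marking behaviour, written in Python for illustration only. The function name `mark_alive`, the `section:` prefix for synthetic symbols, and the choice not to follow relocations out of custom sections are all assumptions made for this example, not details confirmed by the change itself.

```python
def mark_alive(roots, edges, custom_sections):
    """Return the set of symbols that are kept.

    roots:           exported symbols that start the traversal
    edges:           symbol -> symbols it references via relocations
    custom_sections: names of custom sections (e.g. debug sections)
    """
    alive = set()
    worklist = list(roots)
    while worklist:
        sym = worklist.pop()
        if sym in alive:
            continue
        alive.add(sym)
        worklist.extend(edges.get(sym, ()))

    # Custom sections are not reachable from any export, so each gets a
    # synthetic symbol that is unconditionally marked alive.
    for name in custom_sections:
        alive.add("section:" + name)
    return alive


alive = mark_alive(
    roots=["main"],
    edges={"main": ["helper"], "unused": ["helper2"]},
    custom_sections=[".debug_info", ".debug_line"],
)
```

With only export-driven marking, the two debug sections in this example would never be reached; the synthetic symbols make them survive regardless.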

The code and function sections must match in order within the binary. Previously we would order the code section just before writing it to disk. However, by then the atoms were already allocated and their offsets were based on the previous order, which meant the debug info was incorrect. We now order the atoms before allocation, and ensure synthetic functions are created after allocation but appended correctly to the code section, so that they are emitted last and therefore have correct offsets.

When calculating the function offset for its relocation we would previously use the atom's offset with a fixed additional offset. However, we must include the size of the previous function's body, which is LEB128-encoded. This means we cannot use a fixed-size offset to calculate the function body offset within the code section unless we use a fixed-size LEB size, which would increase the binary size. Instead, we simply recalculate the atom's offset based on the currently written bytes during atom writing, as we will not need this offset until we perform relocations for the debug section anyway.
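
To see why a fixed additional offset breaks, here is a small Python sketch. The helper names `encode_uleb128` and `write_code_section` are made up for this example, and the section's leading function-count prefix is left out for brevity. Each body is preceded by its size as unsigned LEB128, and the body's offset is recorded from the number of bytes written so far.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_code_section(bodies):
    """Write size-prefixed bodies; return (bytes, offset of each body).

    Offsets are taken from the bytes written so far, so a size prefix
    of any length is accounted for automatically.
    """
    out = bytearray()
    offsets = []
    for body in bodies:
        out += encode_uleb128(len(body))
        offsets.append(len(out))
        out += body
    return bytes(out), offsets


bodies = [b"\x00" * 10, b"\x00" * 200, b"\x00" * 5]
section, offsets = write_code_section(bodies)
# A 200-byte body needs a 2-byte size prefix, so assuming a 1-byte
# prefix everywhere would place the second and third bodies one byte early.
```

In this example the offsets come out as 1, 13 and 214, whereas a fixed one-byte prefix would give 1, 12 and 213.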

Rather than ordering the atoms of the code section earlier, we simply skip it until we write the actual code section. This is possible because we don't need to know the offset of each atom until we perform the relocations of the debug sections, which we already delay until those sections are written. Debug sections *must* always come after the module, including the code section, so this is fine to rely on. We now do a lot less work and make the codebase simpler as well. This also allows us to remove the `next` field on atoms, reducing the memory usage ever so slightly.

Luukdegram added 6 commits January 3, 2024 06:47

Luukdegram merged commit 3e692d9 into kubkon:main Jan 3, 2024
4 checks passed

Luukdegram deleted the wasm-fixes branch January 3, 2024 05:58
